Benchmarking Memory Performance with the Data Cube Operator
Authors
Abstract
Data movement across the computer memory hierarchy and across the hosts of distributed systems is known to be a limiting factor for applications that process large data sets. We use the Data Cube Operator on an Arithmetic Data Set, called ADC, to benchmark a computer's capability to handle large data sets. To compute the operator we implement a parallel algorithm that computes each view from its smallest parent. The algorithm employs RB-trees to process data that fits into memory and a multi-way merge to process data residing in secondary storage. The ADC stresses all levels of memory and storage by generating some of the 2^d views of an Arithmetic Data Set of d-tuples described by a small number of integers. The data intensity of the ADC can be controlled by selecting the tuple parameters, the sizes of the views, and the number of generated views. We present benchmarking results for the memory performance of a number of computer architectures and of a small distributed system. Based on the benchmark we build a tool that reveals a computer's memory signature and allows computers to be ranked by memory performance.
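The two ideas named in the abstract, computing each view from its smallest parent and merging sorted runs, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`aggregate`, `data_cube`, `merge_runs`), the sum aggregate, and the tuple layout (d integer attributes followed by one measure) are assumptions made for illustration, and a plain dict stands in for the RB-tree used in the paper.

```python
import heapq
from itertools import combinations, groupby
from operator import itemgetter


def aggregate(rows, parent_dims, child_dims):
    """Project a parent view onto child_dims, summing measures of equal keys.

    A dict stands in for the RB-tree of the paper (assumption for brevity).
    """
    idx = [parent_dims.index(dim) for dim in child_dims]
    out = {}
    for key, measure in rows.items():
        child_key = tuple(key[i] for i in idx)
        out[child_key] = out.get(child_key, 0) + measure
    return out


def data_cube(tuples, d):
    """Compute all 2**d views of d-attribute tuples (last field is the measure).

    Each view is derived from its smallest already-computed parent, i.e. the
    view with one more dimension that has the fewest rows.
    """
    full = tuple(range(d))
    base = {}
    for *attrs, measure in tuples:
        key = tuple(attrs)
        base[key] = base.get(key, 0) + measure
    views = {full: base}
    # Go from d-1 dimensions down to 0 so every parent exists before its children.
    for k in range(d - 1, -1, -1):
        for dims in combinations(full, k):
            parents = [p for p in views
                       if len(p) == k + 1 and set(dims) <= set(p)]
            smallest = min(parents, key=lambda p: len(views[p]))
            views[dims] = aggregate(views[smallest], smallest, dims)
    return views


def merge_runs(runs):
    """Multi-way merge of sorted (key, measure) runs, summing equal keys.

    In an out-of-core setting each run would be streamed from secondary
    storage; in-memory lists are used here to keep the sketch self-contained.
    """
    merged = heapq.merge(*runs, key=itemgetter(0))
    return [(key, sum(m for _, m in group))
            for key, group in groupby(merged, key=itemgetter(0))]


if __name__ == "__main__":
    data = [(1, 1, 1, 10), (1, 2, 1, 5), (2, 1, 1, 7), (1, 1, 2, 3)]
    cube = data_cube(data, d=3)
    print(len(cube))   # number of generated views
    print(cube[(0,)])  # view grouped by the first attribute only
```

In this sketch the number of generated views doubles with every added attribute, which is one way to picture how the tuple parameters control the data intensity of the benchmark.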
Similar resources
A Case for Near Memory Computation Inside the Smart Memory Cube
3D integration of solid-state memories and logic, as demonstrated by the Hybrid Memory Cube (HMC), offers major opportunities for revisiting near-memory computation and gives new hope to mitigate the power and performance losses caused by the “memory wall”. In this paper we present the first exploration steps towards design of the Smart Memory Cube (SMC), a new Processor-in-Memory (PIM) archite...
Multidimensional cyclic graph approach: Representing a data cube without common sub-graphs
We present a new full cube computation technique and a cube storage representation approach, called the multidimensional cyclic graph (MCG) approach. The data cube relational operator has exponential complexity and therefore its materialization involves both a huge amount of memory and a substantial amount of time. Reducing the size of data cubes, without a loss of generality, thus becomes a fu...
Arithmetic Data Cube as a Data Intensive Benchmark
Data movement across computational grids and across the memory hierarchy of individual grid machines is known to be a limiting factor for applications involving large data sets. In this paper we introduce the Data Cube Operator on an Arithmetic Data Set which we call Arithmetic Data Cube (ADC). We propose to use the ADC to benchmark grid capabilities to handle large distributed data sets. The ADC st...
High performance cluster computing with 3-D nonlinear diffusion filters
This paper deals with parallelisation and implementation aspects of PDE-based image processing models for large cluster environments with distributed memory. As an example we focus on nonlinear diffusion filtering which we discretise by means of an additive operator splitting (AOS). We start by decomposing the algorithm into small modules that shall be parallelised separately. For this purpose ...
Designing 3-D Nonlinear Diffusion Filters for High Performance Cluster Computing
This paper deals with parallelization and implementation aspects of PDE based image processing models for large cluster environments with distributed memory. As an example we focus on nonlinear isotropic diffusion filtering which we discretize by means of an additive operator splitting (AOS). We start by decomposing the algorithm into small modules that shall be parallelized separately. For thi...